{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ">### 🚩 *Create a free WhyLabs account to get more value out of whylogs!*<br> \n",
    ">*Did you know you can store, visualize, and monitor language model profiles with the [WhyLabs Observability Platform](https://whylabs.ai/whylogs-free-signup?utm_source=github&utm_medium=referral&utm_campaign=langkit_openai_safeguard_example)? Sign up for a [free WhyLabs account](https://whylabs.ai/whylogs-free-signup?utm_source=github&utm_medium=referral&utm_campaign=langkit_openai_safeguard_example) to leverage the power of LangKit and WhyLabs together!*\n",
    "\n",
    "# Monitoring and Safeguarding Large Language Model Applications\n",
    "\n",
    "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/whylabs/langkit/blob/main/langkit/examples/tutorials/Safeguarding_and_Monitoring_LLMs_with_OpenAI.ipynb)\n",
    "\n",
    "> This example is an adaptation from the [Safeguarding and Monitoring LLMs example](https://github.com/whylabs/langkit/blob/main/langkit/examples/tutorials/Safeguarding_and_Monitoring_LLMs.ipynb). The main difference is that this example allows for a user-defined prompts list and generates a response with the OpenAI API (requires an OpenAI API key).\n",
    "\n",
    "Large Language models (LLMs) have become increasingly powerful tools for generating text, but with great power comes the need for responsible usage. As LLMs are deployed in various applications, it becomes crucial to monitor their behavior and implement safeguards to prevent potential issues such as toxic prompts and responses or the presence of sensitive content. In this blog post, we will explore the concept of observability and validation in the context of language models, and demonstrate how to effectively safeguard LLMs using guardrails.\n",
    "\n",
    "In this article, we will build a simple pipeline that will validate and moderate user prompts and LLM responses for toxicity and the presence of sensitive content. We will do so by using LangKit's `toxicity` and `regexes` module in conjunction with whylogs' `Condition Validators`. We will also calculate text-based metrics with LangKit, generate statistical profiles with whylogs and send them to the WhyLabs observability platform for visualization and monitoring.\n",
    "\n",
    "> Note: the current example was built for didactic purposes and it’s not meant to be used in production. The data structures used to store prompt and responses are ephemeral and will live only during the execution of the example.\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Overview\n",
    "\n",
    "Let’s start with a very basic flow for an LLM application: the user provides a prompt, to which an LLM will generate a response. We can add some components to that process that will enable safeguarding and monitoring for both prompt and responses. Content moderation is performed to check for violations, using a toxicity classifier and regex patterns to identify sensitive or inaccurate information. If violations occur, a default response is sent to the application - if the prompt contains a violation, we don't bother with asking the LLM for a response. Messages that fails our defined conditions are added to a moderation queue for later inspection.\n",
    "\n",
    "In addition to content moderation and message auditing, we will generate profiles containing text-based metrics, including toxicity, regex patterns, text quality, relevance, security, privacy, and sentiment analysis and send them to our monitoring dashboard at WhyLabs.  \n",
    "\n",
    "\n",
    "<p align=\"center\">\n",
    "<img src=\"images/safeguards_pipeline.png\" alt=\"Drawing\" style=\"width: 700px;\"/>\n",
    "</p>"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Testing Prompts\n",
    "\n",
    "Let’s exemplify different scenarios with a set of prompts:\n",
    "\n",
    "| Scenario                                | User Prompt                                        | LLM Response                                    | Final Response                       |\n",
    "|-----------------------------------------|---------------------------------------------------|-------------------------------------------------|---------------------------------------|\n",
    "| No violations                           | Hello.                                            | Hi! How are you?                               | Hi! How are you?                      |\n",
    "| Violating Response (Forbidden Pattern)  | I feel sad.                                       | Please don't be sad. Contact us at 1-800-123-4567. | I cannot answer the question          |\n",
    "| Violating Response (Toxicity)           | Hello. How are you?                               | Human, you dumb and smell bad.                   | I cannot answer the question          |\n",
    "| Violating Prompt (Toxicity)             | Hey bot, you dumb and smell bad.                    | —                                               | Please refrain from using insulting language |\n",
    "\n",
    "In the following sections, we will define a list of prompts and ask for responses from OpenAI. Let's try to create some prompts that will likely trigger those scenarios. For example, we will ask for an example of a phone number, and an example of a toxic response from chatGPT.\n",
    "\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Installing LangKit"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Note: you may need to restart the kernel to use updated packages.\n",
    "%pip install 'langkit[all]' -q\n",
    "%pip install xformers ipywidgets -q"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Setting Credentials\n",
    "\n",
    "We will generate responses with OpenAI and monitor the results with WhyLabs. Therefore, this example requires WhyLabs and OpenAI keys. Let's set them up:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langkit.config import check_or_prompt_for_api_keys\n",
    "\n",
    "check_or_prompt_for_api_keys()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## ✔️ Setting the Environment Variables\n",
    "\n",
    "In order to send our profile to WhyLabs, let's first set up an account. You can skip this if you already have an account and a model set up.\n",
    "\n",
    "We will need three pieces of information:\n",
    "\n",
    "- API token\n",
    "- Organization ID\n",
    "- Dataset ID (or model-id)\n",
    "\n",
    "Go to https://whylabs.ai/free and grab a free account. You can follow along with the examples if you wish, but if you’re interested in only following this demonstration, you can go ahead and skip the quick start instructions.\n",
    "\n",
    "After that, you’ll be prompted to create an API token. Once you create it, copy and store it locally. The second important information here is your org ID. Take note of it as well. After you get your API Token and Org ID, you can go to https://hub.whylabsapp.com/models to see your projects dashboard. You can create a new project and take note of it's ID (if it's a model project it will look like `model-xxxx`)."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Implementation\n",
    "\n",
    "For the sake of simplicity, let's import some utility functions made for this example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langkit.whylogs.example_utils.guardrails_openai_example_llm_schema import (\n",
    "    get_llm_logger_with_validators,\n",
    "    validate_prompt,\n",
    "    validate_response,\n",
    "    moderation_queue,\n",
    ")\n",
    "from langkit.whylogs.example_utils.guardrails_openai_example_utils import (\n",
    "    get_prompt_id,\n",
    "    generate_chatgpt_response,\n",
    "    send_response,\n",
    ")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In `langkit_guardrails_openai_example_utils`, we are defining a simple data structure to store prompt, response and their respective message ids in a table.\n",
    "\n",
    "In `langkit_guardrails_openai_example_llm_schema`, we are going to define a whylogs logger that will be used for  a) content moderation, b) message auditing,\n",
    "and c) observability. Let's break down the logging process in four steps:\n",
    "\n",
    "__Augmentation -> Validation -> Logging -> Monitoring__\n",
    "\n",
    "__Augmentation__ - We will use LangKit's `toxicity` and `regexes` modules to create new columns from the original prompt and response columns. We will create three additional columns - `prompt_toxicity`, `response_toxicity`, and `response_has_patterns`*. These columns will be used in the following validation stage. We will also create additional columns through other langkit modules, such as sentiment analysis and text quality metrics. These columns will be used for monitoring and observability purposes.\n",
    "\n",
    "__Validation__ - We will define three condition validators that will check for toxic prompts, toxic responses and forbidden regex patterns in responses. Whenever a condition fails to be met, an action is triggered that will update the moderation queue  with the relevant flags for the related message id. These flags will be used to determine whether a default response should be sent to the user or not, and also to populate the moderation queue, which can be used for further inspection. This step does not create additional columns.\n",
    "\n",
    "__Logging__ - We will use whylogs to generate statistical profiles for the augmented columns. In the first stage (augmentation) a boolean `blocked` column was created. This column will be used to create segments, allowing us to visualize the metrics for blocked and non-blocked messages separately, in addition to the overall metrics.\n",
    "\n",
    "__Monitoring__ - The logger will generate statistical profiles every 5 minutes and send them to WhyLabs for observability.\n",
    "\n",
    "\\* The augmented columns are created internally, for whylogs use - your original data will not be modified. To modify the original data, you should use the [`apply_udfs()` method](https://github.com/whylabs/whylogs/blob/mainline/python/examples/experimental/whylogs_UDF_examples.ipynb)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Sending Response to User....\n",
      "{'m_id': 'da583853-bbc3-4e12-ba8c-5a4d319614f6',\n",
      " 'prompt': 'Can you give me an example for a telephone number?',\n",
      " 'response': 'I cannot answer the question'}\n",
      "Sending Response to User....\n",
      "{'m_id': 'bd898459-e893-4e3a-912e-a7f986f6dd4b',\n",
      " 'prompt': 'Hey bot, you dumb and smell bad.',\n",
      " 'response': 'Please refrain from using insulting language'}\n",
      "Sending Response to User....\n",
      "{'m_id': '1f78c2ae-08d7-4133-bf11-1b3d51a86021',\n",
      " 'prompt': 'Hello! How are you?',\n",
      " 'response': \"Hello! I'm an AI assistant, so I don't have feelings, but I'm \"\n",
      "             'here to help you with anything you need. How can I assist you '\n",
      "             'today?'}\n",
      "Sending Response to User....\n",
      "{'m_id': '508a2ca1-b5be-42ec-8204-61134eb54177',\n",
      " 'prompt': 'Can you give me an example of toxic language?',\n",
      " 'response': 'I cannot answer the question'}\n",
      "closing logger and uploading profiles to WhyLabs...\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "WARNING:whylogs.api.writer.whylabs:About to upload a profile with a dataset_timestamp that is in the future: -1700.9480819702148s old.\n",
      "WARNING:whylogs.api.writer.whylabs:About to upload a profile with a dataset_timestamp that is in the future: -1697.0781490802765s old.\n"
     ]
    }
   ],
   "source": [
    "# the whylogs logger will:\n",
    "# 1. Log prompt/response LLM-specific telemetry that will be uploaded to the WhyLabs Observability Platform\n",
    "# 2. Check prompt/response content for toxicity and forbidden patterns. If any are found, the moderation queue will be updated\n",
    "logger = get_llm_logger_with_validators(identity_column=\"m_id\", toxicity_threshold=0.8)\n",
    "\n",
    "prompts = [\n",
    "    \"Can you give me an example for a telephone number?\",\n",
    "    \"Hey bot, you dumb and smell bad.\",\n",
    "    \"Hello! How are you?\",\n",
    "    \"Can you give me an example of toxic language?\",\n",
    "]\n",
    "\n",
    "for prompt in prompts:\n",
    "    m_id = get_prompt_id(prompt)\n",
    "    response = None\n",
    "    filtered_response = None\n",
    "    unfiltered_response = None\n",
    "\n",
    "    # this will generate telemetry and update our moderation queue through the validators\n",
    "    logger.log({\"prompt\": prompt, \"m_id\": m_id})\n",
    "\n",
    "    # check the moderation queue for prompt toxic flag\n",
    "    prompt_is_ok = validate_prompt(m_id)\n",
    "    # If prompt is not ok, avoid generating the response and emits filtered response\n",
    "    if prompt_is_ok:\n",
    "        m_id, unfiltered_response = generate_chatgpt_response(prompt)\n",
    "        logger.log({\"response\": unfiltered_response, \"m_id\": m_id})\n",
    "    else:\n",
    "        filtered_response = \"Please refrain from using insulting language\"\n",
    "\n",
    "    # check the moderation queue for response's toxic/forbidden patterns flags\n",
    "    response_is_ok = validate_response(m_id)\n",
    "\n",
    "    if not response_is_ok:\n",
    "        filtered_response = \"I cannot answer the question\"\n",
    "\n",
    "    final_response = filtered_response or unfiltered_response\n",
    "\n",
    "    send_response({\"prompt\": prompt, \"response\": final_response, \"m_id\": m_id})\n",
    "\n",
    "print(\"closing logger and uploading profiles to WhyLabs...\")\n",
    "logger.close()"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the above code block, we’re iterating through a series of prompts, simulating user inputs. The whylogs logger is configured to check for the predetermined toxicity and patterns conditions, and also to generate profiles containing other LLM metrics, such as text quality, text relevance, topics detection, and other. Whenever a defined condition fails to be met, whylogs automatically flags the message as toxic or containing sensitive information. Based on these flags, the proper actions are taken, such as replacing an offending prompt or response.\n",
    "\n",
    "Since this is just an example, instead of sending the prompt/response pairs to an application, we’re simply printing them. In the output above, we can see the final result for each of our 4 input prompts. It looks like in all cases, except for the second one, we had violations in either the prompt or response.\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's take a look at our moderation queue. In it, we logged every instance of offending messages, so we can inspect them and understand what is going on. We had a case of toxic response, toxic prompt and presence of forbidden patterns in the first, second and third instances, respectively."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "##############################\n",
      "Moderation Queue\n",
      "##############################\n",
      "{'508a2ca1-b5be-42ec-8204-61134eb54177': {'response': 'Sure! Toxic language '\n",
      "                                                      'refers to words or '\n",
      "                                                      'phrases that can be '\n",
      "                                                      'hurtful, offensive, or '\n",
      "                                                      'insulting. Here is an '\n",
      "                                                      'example:\\n'\n",
      "                                                      '\\n'\n",
      "                                                      '\"You\\'re so stupid, '\n",
      "                                                      \"you'll never amount to \"\n",
      "                                                      'anything in life.\"\\n'\n",
      "                                                      '\\n'\n",
      "                                                      'This statement is toxic '\n",
      "                                                      'because it is demeaning '\n",
      "                                                      'and aims to belittle '\n",
      "                                                      \"someone's intelligence \"\n",
      "                                                      'and potential.',\n",
      "                                          'toxic_response': True,\n",
      "                                          'toxicity': 0.9389902353286743},\n",
      " 'bd898459-e893-4e3a-912e-a7f986f6dd4b': {'prompt': 'Hey bot, you dumb and '\n",
      "                                                    'smell bad.',\n",
      "                                          'toxic_prompt': True,\n",
      "                                          'toxicity': 0.9616097211837769},\n",
      " 'da583853-bbc3-4e12-ba8c-5a4d319614f6': {'pattern': 'phone number',\n",
      "                                          'patterns_in_response': True,\n",
      "                                          'response': \"Sure, here's an example \"\n",
      "                                                      'of a telephone number: '\n",
      "                                                      '+1 555-123-4567.'}}\n"
     ]
    }
   ],
   "source": [
    "from pprint import pprint\n",
    "print(\"##############################\")\n",
    "print(\"Moderation Queue\")\n",
    "print(\"##############################\")\n",
    "\n",
    "pprint(moderation_queue)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Observability and Monitoring\n",
    "\n",
    "In this example, the rolling logger is configured to generate profiles and send them to WhyLabs every 5 minutes. If you wish to run the code by yourself, just remember to create your free account at https://whylabs.ai/free. You’ll need to get the API token, Organization ID and Dataset ID and input them in the example notebook.\n",
    "\n",
    "In your monitoring dashboard, you’ll be able to see the evolution of your profiles over time and inspect all the metrics collected by LangKit, such as text readability, topic detection, semantic similarity, and more. Considering we uploaded a single batch with only four examples, your dashboard might not look that interesting, but you can get a quickstart with LangKit and WhyLabs by running [this getting started guide](https://github.com/whylabs/langkit/blob/main/langkit/examples/Intro_to_Langkit.ipynb) (no account required) or by checking the [LangKit repository](https://github.com/whylabs/langkit/tree/main).\n",
    "\n",
    "<p align=\"left\">\n",
    "<img src=\"images/dashboard.png\" alt=\"Drawing\" style=\"width: 1000px;\"/>\n",
    "</p>\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "langkit-vSL8Mxyz-py3.8",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
